Exploring Factors that Contribute to Country Development
Intro
This is an R Markdown blog template. This document will be knit to HTML to produce a webpage that will be hosted publicly via GitHub.
Website publication workflow
Edit Rmd
Knit to HTML to view progress. You may need to click “Open in Browser” for some content to show (sometimes content won’t show until you actually push your changes to GitHub and view the published website).
Commit and push changes when you are ready. The website may take a couple minutes to update automatically after the push, but you may need to clear your browser’s cache or view the page in a private/incognito window to see the changes more quickly.
You can include text, code, and output as usual.
Remember to take full advantage of Markdown and follow our Style
Guide.
Examples and additional guidance are provided below.
Take note of the default code chunk options in the
setup code chunk. For example, unlike the rest of the Rmd
files we worked in this semester, the default code chunk option is
echo = FALSE, so you will need to set
echo = TRUE for any code chunks you would like to display
in the blog. You should be thoughtful and intentional about the code you
choose to display.
Education and the World: Literacy rates, Human Development Index, and their relationship
You can include links using Markdown syntax as shown.
You should include links to relevant sites as you write. You should additionally include a list of references at the end of your blog with full citations (and relevant links).
Human Development Index (HDI) and Education
In the world map below, countries are colored according to their Human Development Index (HDI) score. Each country is assigned an HDI score, a number between 0 and 1 designed, in a rough sense, to measure quality of life. Notice that countries farther from the equator are more likely to have a high HDI score than countries closer to it. This trend shows up as a visual gradient on the map: the farther a country is from the equator, the higher its HDI score tends to be and the bluer it appears. This is a tendency, not a general rule. The term “Global South” is often used to describe a collection of so-called “under-developed” countries near and south of the equator, a grouping that the map below suggests.
However, this map is quite one-dimensional. What exactly does HDI tell us? What, in concrete terms, does “human development” mean? The goal of the following analysis is to shed light on HDI and its limitations through other measures, in particular measures related to literacy rates and population density.
Literacy Rate and HDI
We begin our inquiry into HDI and education by asking: Which is a better predictor of literacy rates - HDI, or average number of years of education? Moreover, what does it mean if HDI predicts literacy rates better than average number of years of education?
For each country, we can find an expected number of years of schooling; this is the number of years the average student attends school. In countries where the average years of schooling is higher, we expect to find higher average literacy rates.
For each continent, we calculated two correlation coefficients. First, we found the correlation between HDI score and literacy rate; in other words, how well HDI predicts literacy rate for that continent. Second, we found the correlation between average years of education and literacy rate; in other words, how well years of schooling predict literacy rate for that continent.
Next, for each continent, we found the difference between these two correlations. The interesting results are those where this difference is small. A small difference in these two values means that “development” is as good a predictor of literacy rates as years of education. A small difference indicates that non-educational “developmental” factors are influencing literacy rates.
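The per-continent comparison described above can be sketched in a few lines of base R. This is a minimal illustration, not the code behind the results in this post: the `countries` data frame, its column names (`region`, `hdi`, `school_years`, `literacy`), and all of its values are hypothetical, and taking the absolute value of the difference is an assumption.

```r
# Hypothetical toy data; names and values are illustrative only.
countries <- data.frame(
  region       = rep(c("Region A", "Region B"), each = 5),
  hdi          = c(0.45, 0.52, 0.60, 0.71, 0.80, 0.70, 0.75, 0.82, 0.88, 0.93),
  school_years = c(8, 9, 11, 12, 14, 12, 13, 15, 16, 18),
  literacy     = c(55, 62, 70, 85, 92, 90, 93, 96, 98, 99)
)

# For each region, correlate literacy with HDI and with years of schooling.
cor_by_region <- do.call(rbind, lapply(split(countries, countries$region), function(d) {
  data.frame(
    region     = d$region[1],
    cor_hdi    = cor(d$hdi, d$literacy),
    cor_school = cor(d$school_years, d$literacy)
  )
}))

# Assumption: compare the two correlations by their absolute difference.
cor_by_region$difference <- abs(cor_by_region$cor_hdi - cor_by_region$cor_school)

# Attach the one-row-per-region summary back to the country-level rows.
countries_with_cor <- merge(countries, cor_by_region, by = "region")

print(cor_by_region)
```

Because the summary has exactly one row per region, the merge keeps exactly one row per country.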
Observe that two continents, South America and Africa, stand out as having a smaller difference. This suggests that in these two continents, extra-educational factors are influencing literacy rates. This observation tracks with the delineation into “Global South” and “Global North” indicated by the plot of HDI. That is, the literacy rates of South America and Africa, continents situated in the Global South, suffer from extra-educational factors. Indeed, the Global South consists of nations which have suffered the effects of capitalist globalization (https://onlineacademiccommunity.uvic.ca/globalsouthpolitics/2018/08/08/global-south-what-does-it-mean-and-why-use-the-term/). Among the effects of this phenomenon may be harm to the literacy situation of affected nations.
One problem with this analysis is that it is not granular. It gives us a view of the world that is split into seven, when in reality, the world has far more than seven borders.
Our next analysis clusters countries according to literacy rate and population density. The goal of the analysis is to show that the division into Global North and Global South is inadequate to understand differences in literacy rates. In other words, the delineation into North and South indicated by HDI is a simplification - the actual situation is more complicated.
Before this analysis can proceed, we first make an observation about the relationship between population density and literacy rates. Compare the plot of Population Density vs. Literacy Rate with the plot of Log of Population Density vs. Literacy Rate. Observe that the best-fit curve on the first plot would be exponential, while on the second plot a straight line fits well. This suggests that for the purposes of clustering, it would be appropriate to cluster Log of Population Density against Literacy Rate.
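The effect of the log transformation can be checked numerically by comparing how well a straight line fits before and after taking logs. The sketch below uses hypothetical toy data (the variable names and values are illustrative, not the blog's dataset) and, as an assumption, treats literacy rate as the response in both fits.

```r
# Hypothetical toy data: literacy rises linearly with log(density).
log_density <- seq(0, 8, length.out = 50)
density     <- exp(log_density)                  # people per square km (illustrative)
literacy    <- 60 + 4 * log_density + sin(1:50)  # small deterministic wiggle as noise

fit_raw <- lm(literacy ~ density)       # straight line on the raw scale
fit_log <- lm(literacy ~ log(density))  # straight line on the log scale

r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared

cat("R-squared, raw density:", round(r2_raw, 3), "\n")
cat("R-squared, log density:", round(r2_log, 3), "\n")
```

A clearly higher R-squared on the log scale is one simple signal that the log-transformed variable is the better input for a linear fit or a distance-based method such as clustering.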
The elbow plot suggests that a cluster analysis using three clusters is most appropriate. The plot below associates each country with one of three clusters. According to the scaled cluster centers reported in the table below, the first cluster, 1, consists of countries with high literacy rate and low population density. The second cluster, 2, consists of countries with low literacy rate. Notice that this second cluster ranges over a wide variety of population densities. The third cluster, 3, consists of countries with high literacy rate and high population density.
| latestRate_scaled | density_scaled | size | withinss | cluster |
|---|---|---|---|---|
| 0.412 | -1.07 | 47 | 25.9 | 1 |
| -1.77 | -0.150 | 33 | 42.1 | 2 |
| 0.432 | 0.612 | 90 | 54.4 | 3 |
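A summary like the one above can be produced with base R's `kmeans()`. The following is a minimal sketch under stated assumptions, not the code used for this post: the `toy` data frame and its values are hypothetical, the scaled column names simply mirror the table's columns, and `nstart = 25` is an illustrative choice.

```r
set.seed(1)  # k-means uses random starts; fix the seed for reproducibility

# Hypothetical toy data: three loose groups of countries.
toy <- data.frame(
  latestRate = c(rnorm(20, 95, 2), rnorm(20, 95, 2), rnorm(20, 55, 8)),
  density    = exp(c(rnorm(20, 2, 0.5), rnorm(20, 6, 0.5), rnorm(20, 4, 1.5)))
)

# Cluster on standardized literacy rate and standardized log density.
scaled <- data.frame(
  latestRate_scaled = as.numeric(scale(toy$latestRate)),
  density_scaled    = as.numeric(scale(log(toy$density)))
)

# Elbow plot input: total within-cluster sum of squares for k = 1..8.
wss <- sapply(1:8, function(k) kmeans(scaled, centers = k, nstart = 25)$tot.withinss)

# Draw the elbow plot only when run interactively.
if (interactive()) {
  plot(1:8, wss, type = "b", xlab = "Number of clusters k",
       ylab = "Total within-cluster sum of squares")
}

# Fit the three-cluster solution and summarize each cluster.
km <- kmeans(scaled, centers = 3, nstart = 25)
cluster_summary <- data.frame(km$centers, size = km$size,
                              withinss = km$withinss, cluster = factor(1:3))
print(cluster_summary)
```

Cluster labels returned by `kmeans()` are arbitrary, so the numbering can change between runs unless the seed is fixed. It is safer to describe clusters by their centers than by their numbers.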
The following map colors each country according to its cluster assignment. What is interesting about this map is that it shows how groups of contiguous countries are likely to fall into the same cluster. What does this mean? As an example, examine the pair of North African countries, Algeria and Libya. These two countries are near the equator, and in our previous analysis, were part of the group described as the Global South.
Whereas HDI assigns a bare number to each country, the map below expresses relationships between a country’s position, its population density, and its literacy rate. Notice the pockets of countries from the same cluster. Countries from a given cluster tend to be surrounded by others from the same cluster.
Algeria and Libya are countries within the same cluster - number 1,
high literacy rate and low population density. Notice that these two
contiguous countries share a common situation with respect to literacy.
And Algeria and Libya are just an example. Throughout the map below,
groups of contiguous countries tend to fall within the same cluster.
This indicates: the assignment of a country to a particular cluster does
not depend only on the circumstances within that country; it depends
also on the broader regional context in which a country is situated.
Even though, in certain places, HDI is as good a predictor of literacy rate as years of education, this analysis obscures the fact that regional factors are at play. It simply is not the case that a single number - whether HDI or “Expected Years of Education” - can capture the whole situation regarding the literacy of a country. The reason for this is made clear by the map above: the situation regarding literacy in one country does not depend only on that country. Thus, numbers examining countries in isolation are largely incapable of expressing the situation. The map above, in which pockets of similar countries emerge grouped together geographically, testifies to the reality of this interrelationship. HDI is an excellent tool for examining a country in isolation. But comprehending the literacy situation in a given country, as the map above demonstrates, requires looking beyond the borders of that country.
Visualizations
Visualizations, particularly interactive ones, will be well-received. That said, do not overuse visualizations. You may be better off with one complicated but well-crafted visualization as opposed to many quick-and-dirty plots. Any plots should be well-thought-out, properly labeled, informative, and visually appealing.
If you want to include dynamic visualizations or tables, you should explore your options from packages that are built from htmlwidgets. These htmlwidgets-based packages offer ways to build lighter-weight, dynamic visualizations or tables that don’t require an R server to run! A more complete list of packages is available on the linked website, but a short list includes:
- plotly: Interactive graphics with D3
- leaflet: Interactive maps with OpenStreetMap
- dygraphs: Interactive time series visualization
- visNetwork: Network graph visualization with vis.js
- sparkline: Small inline charts
- threejs: Interactive 3D graphics
You may embed a published Shiny app in your blog if useful, but be aware that there is a limited window size for embedded objects, which tends to make the user experience of the app worse relative to a dedicated Shiny app page. Additionally, Shiny apps will go idle after a few minutes and have to be reloaded by the user, which may also affect the user experience.
Any Shiny apps embedded in your blog should be accompanied by the link to the published Shiny app (I did this using a figure caption in the code chunk below, but you don’t have to incorporate the link in this way).
Tables
DT package
The DT package is great for making dynamic tables that can be displayed, searched, and filtered by the user without needing an R server or Shiny app!
Note: you should load any packages you use in the setup
code chunk as usual. The library() functions are shown
below just for demonstration.
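```r
library(DT)
library(dplyr)  # provides select(); normally loaded in the setup chunk

# Searchable, filterable table of three mtcars columns
mtcars %>%
  select(mpg, cyl, hp) %>%
  datatable(colnames = c("MPG", "Number of cylinders", "Horsepower"),
            filter = 'top',
            options = list(pageLength = 10, autoWidth = TRUE))
```

kableExtra package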
You can also use kableExtra for customizing HTML tables.
```r
library(kableExtra)

# Styled HTML table of the summary statistics for the cars dataset
summary(cars) %>%
  kbl(col.names = c("Speed", "Distance"),
      row.names = FALSE) %>%
  kable_styling(bootstrap_options = "striped",
                full_width = FALSE) %>%
  row_spec(0, bold = TRUE) %>%
  column_spec(1:2, width = "1.5in")
```

| Speed | Distance |
|---|---|
| Min. : 4.0 | Min. : 2.00 |
| 1st Qu.:12.0 | 1st Qu.: 26.00 |
| Median :15.0 | Median : 36.00 |
| Mean :15.4 | Mean : 42.98 |
| 3rd Qu.:19.0 | 3rd Qu.: 56.00 |
| Max. :25.0 | Max. :120.00 |
Images
Images and gifs can be displayed using code chunks:
“Safe Space” by artist Kenesha Sneed
This is a figure caption
You may also use Markdown syntax for displaying images as shown below, but code chunks offer easier customization of the image size and alignment.